2f2b265625d76a6704b08093c652fd79-Supplemental.pdf

Neural Information Processing Systems

A central challenge in training classification models in real-world federated systems is learning with non-IID data. To cope with this, most existing works either enforce regularization in the local optimization or improve the model aggregation scheme at the server.
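To make the server-side aggregation the excerpt mentions concrete, here is a minimal FedAvg-style weighted average of client parameters, sketched in plain Python. The function name and the flat-list parameter representation are assumptions for illustration, not from the paper:

```python
def aggregate(client_params, client_sizes):
    """FedAvg-style server aggregation: average each parameter
    across clients, weighted by local dataset size."""
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(params[i] * size for params, size in zip(client_params, client_sizes)) / total
        for i in range(n_params)
    ]

# two clients with 2-parameter models; the second client holds twice the data
global_params = aggregate([[1.0, 0.0], [4.0, 3.0]], [10, 20])
```

Weighting by dataset size is what makes the scheme sensitive to non-IID splits: clients with more data dominate the average, which is exactly the behavior the regularization and aggregation fixes above try to correct.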





A Appendix

Neural Information Processing Systems

However, these methods were ineffective in our experiment. As we explain in Section 3.2, our test-time augmentation space consists of 12 operations. Figure 4 shows a selected data sample and its augmented versions. The PIL.ImageEnhance.Sharpness function yields a blurred image for enhancement factors less than 1. The original image was distorted by corruptions such as rotation and noise. We use up to 64 nodes to parallelize the data-generating process.
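The aggregation step of test-time augmentation, averaging the model's class probabilities over augmented copies of an input, can be sketched in plain Python. Everything below (the `tta_predict` name, the toy augmentations, and the dummy model) is illustrative and assumed, not from the paper, which uses 12 PIL-based operations:

```python
def tta_predict(model, image, augmentations):
    """Average class probabilities over test-time-augmented
    copies of the input."""
    preds = [model(aug(image)) for aug in augmentations]
    n_classes = len(preds[0])
    return [sum(p[c] for p in preds) / len(preds) for c in range(n_classes)]

# toy stand-ins: an "image" as a list of pixel intensities,
# two augmentations, and a dummy two-class model whose output
# depends on the mean pixel value
identity = lambda img: img
brighten = lambda img: [min(1.0, p + 0.2) for p in img]

def dummy_model(img):
    m = sum(img) / len(img)
    return [m, 1.0 - m]  # pretend class "probabilities"

probs = tta_predict(dummy_model, [0.2, 0.4], [identity, brighten])
```

With real data the augmentations would be PIL operations (e.g. `PIL.ImageEnhance.Sharpness`) and the model a trained classifier; the averaging logic is unchanged.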


Boosted CVaR Classification (Supplementary Material)

Neural Information Processing Systems

On the COMPAS dataset, we use a three-layer feed-forward neural network with ReLU activations as the classification model. For optimization we use SGD with momentum and a learning rate of 0.01; the batch size is 128. On the CelebA dataset, we use a ResNet18 as the classification model. The remaining 45,000 training samples constitute the training set. The batch size is 128.
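A minimal sketch of the SGD-with-momentum update used above, in plain Python. The excerpt only specifies the 0.01 learning rate; the momentum coefficient 0.9 is a common default and an assumption here:

```python
def sgd_momentum_step(params, grads, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update:
    v <- momentum * v - lr * grad;  p <- p + v."""
    for i in range(len(params)):
        velocity[i] = momentum * velocity[i] - lr * grads[i]
        params[i] += velocity[i]
    return params, velocity

params, velocity = [1.0], [0.0]
params, velocity = sgd_momentum_step(params, [0.5], velocity)
```

The velocity term accumulates a decaying average of past gradients, which smooths the minibatch noise at batch size 128.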


af5baf594e9197b43c9f26f17b205e5b-Supplemental.pdf

Neural Information Processing Systems

Supplementary Material (Appendix): When Are Solutions Connected in Deep Networks? Hence, (15) holds and the desired claim follows. Thus, by using assumption (A1) again, we can apply Corollary A.1 of [...]. Thus, the desired claim follows from Theorem 4.1. Note that we apply Corollary A.1 of [...]. Thus, the second condition follows from assumption (A1), and the application of Corollary A.1 is justified. Let us assume w.l.o.g. that [...]. This shows that the set of features formed by these neurons is linearly separable.


Mysterious EfficientNets

#artificialintelligence

The research paper Rethinking Model Scaling for Convolutional Neural Networks introduces the EfficientNet family of architectures. In the paper, the authors systematically study model scaling and identify that carefully balancing network depth, width, and resolution can lead to better results. The paper introduces a compound scaling method that uniformly scales all dimensions of depth, width, and resolution. I feel that it covers a missing piece in designing CNN architectures. The paper states that the key idea is to balance all dimensions, which can be achieved by scaling each of them with a constant ratio.
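The constant-ratio idea can be sketched as follows. The coefficients α=1.2, β=1.1, γ=1.15 are the values the EfficientNet paper reports from its grid search on the baseline network; the function name is mine:

```python
def compound_scale(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Compound scaling: scale depth, width, and resolution together
    with a single coefficient phi:
        depth  multiplier = alpha ** phi
        width  multiplier = beta  ** phi
        resolution multiplier = gamma ** phi
    The paper constrains alpha * beta**2 * gamma**2 to be roughly 2,
    so each unit increase of phi roughly doubles the FLOPs."""
    return alpha ** phi, beta ** phi, gamma ** phi

depth_mult, width_mult, res_mult = compound_scale(2)
```

Choosing a larger `phi` then yields the bigger EfficientNet variants without re-searching the three ratios independently.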